Part 1

The basics

Parisa Gregg & Myles Mitchell

Quarto extension - VS Code

  • Settings > Extensions

  • Search “quarto” in extensions search bar

  • Click the Quarto extension

  • Click “Install in pycon-quarto…”

Creating a new document - VS Code

  • File > New File… > Quarto Document (qmd)

  • Set title and output format

  • Click Render (or type Ctrl+Shift+K)

Rendering with the command line

  • Terminal > New Terminal

  • Preview your document:

quarto preview my_doc.qmd
  • Render your document:
quarto render my_doc.qmd
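
The same render step can also be scripted. A minimal sketch, assuming the `quarto` executable is on your PATH; `build_render_command` and `render` are hypothetical helper names, and `--to` is the Quarto CLI flag for choosing an output format:

```python
import shutil
import subprocess


def build_render_command(doc, output_format=None):
    """Build the argument list for `quarto render` (hypothetical helper)."""
    command = ["quarto", "render", doc]
    if output_format is not None:
        # --to selects the output format, e.g. html, pdf or docx
        command += ["--to", output_format]
    return command


def render(doc, output_format=None):
    """Run quarto render, or return None when quarto is not installed."""
    if shutil.which("quarto") is None:
        return None
    return subprocess.run(build_render_command(doc, output_format), check=False)


print(build_render_command("my_doc.qmd", "html"))
```

Calling `render("my_doc.qmd")` would then run the same command as typing it in the terminal.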

YAML header

---
title: "A very cool title"
author: "Me"
date: "12 November 2022"
format: html
jupyter: python3
---
  • Output formats include html, pdf and docx
  • Use jupyter option to select the Jupyter kernel
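
To make the header structure concrete, here is a minimal sketch that splits a document into its YAML header and body. `split_front_matter` is a hypothetical helper and only handles flat `key: value` lines; nested options would need a real YAML parser.

```python
def split_front_matter(text):
    """Return (options, body) for a document that starts with a --- header."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    # The header ends at the next line that contains only ---
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        return {}, text
    options = {}
    for line in lines[1:end]:
        key, _, value = line.partition(":")
        # Drop surrounding whitespace and optional double quotes
        options[key.strip()] = value.strip().strip('"')
    body = "\n".join(lines[end + 1:])
    return options, body


doc = '''---
title: "A very cool title"
author: "Me"
format: html
jupyter: python3
---
Hello, Quarto!
'''

options, body = split_front_matter(doc)
print(options["format"], options["jupyter"])
```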

Note

YAML: “YAML Ain’t Markup Language” (originally “Yet Another Markup Language”)

Task 1: Your first Quarto document

  1. File > New File… > Quarto Document (qmd)

  2. Add a YAML header declaring a title, author and HTML output

---
title: "A very cool title"
author: "Me"
format: html
---
  3. Click Render (or type Ctrl+Shift+K)
05:00

Including content

  • Text
  • Links
  • Images
  • Code
  • Embedded tables and plots
  • Equations
  • References

Quarto documentation

Font

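| Markdown            | Output            |
|---------------------|-------------------|
| `**bold**`          | **bold**          |
| `__bold__`          | **bold**          |
| `*italic*`          | *italic*          |
| `_italic_`          | *italic*          |
| `~~strikethrough~~` | ~~strikethrough~~ |
| `^superscript^`     | ^superscript^     |
| `~subscript~`       | ~subscript~       |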

Bullet points (use -, + or *)

- Banana
- Apple
    - Pink lady
    + Royal gala
    * Granny Smith
- Pear
  • Banana
  • Apple
    • Pink lady
    • Royal gala
    • Granny Smith
  • Pear

Numbered lists

1. Banana
1. Apple
    - Pink lady
    - Royal gala
    - Granny Smith
1. Pear
  1. Banana
  2. Apple
    • Pink lady
    • Royal gala
    • Granny Smith
  3. Pear

Headings

# Heading level 1
## Heading level 2
### Heading level 3
#### Heading level 4
##### Heading level 5
###### Heading level 6

Task 2: Adelie Penguins 🐧

  1. Add the text from task02.txt to your Quarto doc

  2. Match the formatting (italics, bold, links) of the first sentence in the Adelie Penguin wiki

  3. Add an image of the Adelie Penguin (there’s one in the exercises folder)

  4. Can you add the penguin emoji to your text?

10:00

Including code

Code chunks

  • You can evaluate code!
  • Not limited in what code you can run
  • We could load data…
```{python}
import pandas as pd

penguins = pd.read_csv(
    'https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-07-28/penguins.csv'
)
```

Code chunks

  • Can have as many chunks as you like
```{python}
penguins
```
       species     island  bill_length_mm  ...  body_mass_g     sex  year
0       Adelie  Torgersen            39.1  ...       3750.0    male  2007
1       Adelie  Torgersen            39.5  ...       3800.0  female  2007
2       Adelie  Torgersen            40.3  ...       3250.0  female  2007
3       Adelie  Torgersen             NaN  ...          NaN     NaN  2007
4       Adelie  Torgersen            36.7  ...       3450.0  female  2007
..         ...        ...             ...  ...          ...     ...   ...
339  Chinstrap      Dream            55.8  ...       4000.0    male  2009
340  Chinstrap      Dream            43.5  ...       3400.0  female  2009
341  Chinstrap      Dream            49.6  ...       3775.0    male  2009
342  Chinstrap      Dream            50.8  ...       4100.0    male  2009
343  Chinstrap      Dream            50.2  ...       3775.0  female  2009

[344 rows x 8 columns]

Code chunks

  • Order matters!
x
Error in py_call_impl(callable, dots$args, dots$keywords): NameError: name 'x' is not defined
x = 5
x
5
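
Chunks run top to bottom in one shared session, much like a single script. A minimal sketch that reproduces the error above in plain Python, catching it so the remaining lines can still run:

```python
try:
    print(x)  # x has not been assigned yet
except NameError as err:
    message = str(err)
    print("NameError:", message)

x = 5  # from here on, x is available to every later chunk
print(x)
```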

Chunk options

  • Control properties of the code within the chunk and its outputs

  • Controlled using YAML within the code chunks

  • Loads of options

Chunk options

| Option  | Purpose                                                    | Default value |
|---------|------------------------------------------------------------|---------------|
| echo    | Show/hide code chunks in the output                        | true          |
| eval    | Whether to evaluate code within the chunk                  | true          |
| warning | Show/hide messages/warnings produced by code in the output | true          |
| error   | Allow the code to error but show the error in the output   | false         |
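
As an illustration, a chunk that combines several of these options might look like the sketch below. In a .qmd file the fence would read `{python}`. The `#|` lines are ordinary comments to Python, so the body also runs as a plain script. The variable names are made up for the example, and the values are the first four non-missing bill lengths shown earlier.

```python
#| echo: false
#| warning: false
#| error: true
bill_lengths_mm = [39.1, 39.5, 40.3, 36.7]
mean_bill_length = sum(bill_lengths_mm) / len(bill_lengths_mm)
print(round(mean_bill_length, 2))
```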

Chunk options: echo

Use to show / hide code

#| echo: false
penguins

produces

       species     island  bill_length_mm  ...  body_mass_g     sex  year
0       Adelie  Torgersen            39.1  ...       3750.0    male  2007
1       Adelie  Torgersen            39.5  ...       3800.0  female  2007
2       Adelie  Torgersen            40.3  ...       3250.0  female  2007
3       Adelie  Torgersen             NaN  ...          NaN     NaN  2007
4       Adelie  Torgersen            36.7  ...       3450.0  female  2007
..         ...        ...             ...  ...          ...     ...   ...
339  Chinstrap      Dream            55.8  ...       4000.0    male  2009
340  Chinstrap      Dream            43.5  ...       3400.0  female  2009
341  Chinstrap      Dream            49.6  ...       3775.0    male  2009
342  Chinstrap      Dream            50.8  ...       4100.0    male  2009
343  Chinstrap      Dream            50.2  ...       3775.0  female  2009

[344 rows x 8 columns]

Chunk options: warning

Show / hide warnings produced by code in the output

penguins.lookup([0], ["island"])
array(['Torgersen'], dtype=object)

<string>:1: FutureWarning: The 'lookup' method is deprecated and will be removed in a future version. You can use DataFrame.melt and DataFrame.loc as a substitute.

Not anymore…

```{python}
#| warning: false
penguins.lookup([0], ["island"])
```
array(['Torgersen'], dtype=object)
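
Because `DataFrame.lookup` is deprecated and may be missing from newer pandas releases, here is a stand-in that raises a warning using only the standard library. `old_area` is a hypothetical function invented for this sketch.

```python
#| warning: false
import warnings


def old_area(width, height):
    """Hypothetical deprecated function, used only to trigger a warning."""
    warnings.warn("old_area is deprecated", FutureWarning)
    return width * height


area = old_area(3, 4)
print(area)
```

With `#| warning: false` the rendered document should show only the printed value. Removing that line should bring the FutureWarning text back into the output.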

Code collapsing

#| echo: fenced
#| code-fold: true
Code
```{python}
#| code-fold: true
import numpy as np
import pandas as pd
```

Task 3: Running some code! 🐧

  1. Open the document task03.qmd

  2. Under the analysis subheading, add a code chunk to import Pandas

  3. Hide the code chunk with #| echo: false

  4. Add another code chunk to read in the penguins data (link in doc) and display it, e.g.

    penguins = pd.read_csv("link_to_penguin_data")
  5. Make the data loading chunk a collapsed code chunk with #| echo: fenced and #| code-fold: true

10:00

Graphs

```{python}
#| eval: false
#| fig-cap: "Flipper vs bill length of penguin species across three years"
#| fig-width: 8
import plotly.express as px

px.scatter(
    penguins,
    x="bill_length_mm",
    y="flipper_length_mm",
    color="species",
    facet_col="year",
    trendline="ols",
)
```

Graphs

Flipper vs bill length of penguin species across three years

Tables

  • Markdown syntax
| fruit  | count  | color  |
|--------|--------|--------|
| banana | 5      | yellow |
| apple  | 6      | red    |
| pear   | 2      | green  |
fruit count color
banana 5 yellow
apple 6 red
pear 2 green

Tables

  • Convert python data to markdown
```{python}
#| eval: false
#| tbl-cap: "Table of fruits"
from IPython.display import Markdown
from tabulate import tabulate

table = [
    ["banana", 5, "yellow"],
    ["apple", 6, "red"],
    ["pear", 2, "green"],
]
Markdown(
    tabulate(table, headers=["fruit", "count", "color"])
)
```

Tables

Table of fruits

| fruit  | count  | color  |
|--------|--------|--------|
| banana | 5      | yellow |
| apple  | 6      | red    |
| pear   | 2      | green  |
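
If tabulate is not installed, a pipe table can be assembled by hand. A minimal sketch; `to_pipe_table` is a hypothetical helper, not part of tabulate or Quarto:

```python
def to_pipe_table(headers, rows):
    """Format a header list and a list of rows as a Markdown pipe table."""
    lines = [
        "| " + " | ".join(headers) + " |",
        "|" + "|".join("---" for _ in headers) + "|",
    ]
    for row in rows:
        # str() lets numbers and text share the same row
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)


table = [
    ["banana", 5, "yellow"],
    ["apple", 6, "red"],
    ["pear", 2, "green"],
]
markdown_table = to_pipe_table(["fruit", "count", "color"], table)
print(markdown_table)
```

Inside a chunk, the resulting string can be wrapped in `Markdown(...)` from `IPython.display`, as in the tabulate example, so that it displays as a table.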

Tables

  • Pandas DataFrame
penguins

Tables

       species     island  bill_length_mm  ...  body_mass_g     sex  year
0       Adelie  Torgersen            39.1  ...       3750.0    male  2007
1       Adelie  Torgersen            39.5  ...       3800.0  female  2007
2       Adelie  Torgersen            40.3  ...       3250.0  female  2007
3       Adelie  Torgersen             NaN  ...          NaN     NaN  2007
4       Adelie  Torgersen            36.7  ...       3450.0  female  2007
..         ...        ...             ...  ...          ...     ...   ...
339  Chinstrap      Dream            55.8  ...       4000.0    male  2009
340  Chinstrap      Dream            43.5  ...       3400.0  female  2009
341  Chinstrap      Dream            49.6  ...       3775.0    male  2009
342  Chinstrap      Dream            50.8  ...       4100.0    male  2009
343  Chinstrap      Dream            50.2  ...       3775.0  female  2009

[344 rows x 8 columns]

Tables

  • Convert DataFrame to markdown
Markdown(
    tabulate(penguins.head(), headers="keys", showindex=False)
)

Tables

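| species | island    | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex    | year |
|---------|-----------|----------------|---------------|-------------------|-------------|--------|------|
| Adelie  | Torgersen | 39.1           | 18.7          | 181               | 3750        | male   | 2007 |
| Adelie  | Torgersen | 39.5           | 17.4          | 186               | 3800        | female | 2007 |
| Adelie  | Torgersen | 40.3           | 18            | 195               | 3250        | female | 2007 |
| Adelie  | Torgersen | nan            | nan           | nan               | nan         | nan    | 2007 |
| Adelie  | Torgersen | 36.7           | 19.3          | 193               | 3450        | female | 2007 |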

Tables

  • Convert DataFrame to markdown
species_avg = penguins.groupby("species").mean(numeric_only=True)
Markdown(
    tabulate(species_avg, headers="keys")
)

Tables

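| species   | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | year    |
|-----------|----------------|---------------|-------------------|-------------|---------|
| Adelie    | 38.7914        | 18.3464       | 189.954           | 3700.66     | 2008.01 |
| Chinstrap | 48.8338        | 18.4206       | 195.824           | 3733.09     | 2007.97 |
| Gentoo    | 47.5049        | 14.9821       | 217.187           | 5076.02     | 2008.08 |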

Task 4: Tables and graphs 🐧

  1. Open the document task04.qmd

  2. Add a table displaying the first five rows of the data (use the empty code chunk provided).

  3. Add a plot showing the distribution of bill length for each sex and species. Use the plotting code below:

px.histogram(
    penguins,
    x="bill_length_mm",
    color="sex",
    facet_row="species",
)
  4. Add a caption to the plot using the fig-cap code chunk option.
10:00